Nonlinear Emotional Prosody Generation and Emotional Tags
Authors
Abstract
The paper analyzes prosodic features, including intonation, speaking rate, and intensity, based on classified emotional speech. As an important feature of voice quality, the voice source is also deduced for analysis. Using these analysis results, the paper builds both a CART model and a weight-decay neural network model to determine the importance of each acoustic feature for emotional speech classification and to reveal whether there is an underlying consistency between acoustic features and speech emotion. The results show that the proposed method can obtain the importance of each acoustic feature for emotional speech classification through its weight, and can further improve emotional speech classification.
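As a rough illustration of reading feature importance from weights trained with weight decay, the sketch below fits a single sigmoid unit with an L2 penalty on synthetic data. It is not the paper's model: the data, the function names (train_weight_decay_unit, feature_importance), and the hyperparameters are all assumptions made for this example, and the CART model is not covered.

```python
import math
import random


def train_weight_decay_unit(X, y, decay=0.01, lr=0.1, epochs=200):
    """Train one sigmoid unit with L2 weight decay by batch gradient descent."""
    n = len(X)
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi  # gradient of the cross-entropy loss w.r.t. z
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(n_features):
            # Weight decay adds decay * w[j] to the gradient, which shrinks
            # weights that the data does not support.
            w[j] -= lr * (grad_w[j] / n + decay * w[j])
        b -= lr * grad_b / n
    return w, b


def feature_importance(w):
    """Normalize absolute weights so they sum to 1 (assumes scaled inputs)."""
    total = sum(abs(wj) for wj in w)
    if total == 0:
        return [0.0] * len(w)
    return [abs(wj) / total for wj in w]


random.seed(0)
# Hypothetical stand-ins for two standardized acoustic features:
# column 0 separates the two classes, column 1 is pure noise.
X, y = [], []
for _ in range(200):
    label = random.randint(0, 1)
    informative = random.gauss(1.0 if label else -1.0, 1.0)
    noise = random.gauss(0.0, 1.0)
    X.append([informative, noise])
    y.append(label)

weights, bias = train_weight_decay_unit(X, y)
importance = feature_importance(weights)
print(importance)
```

Comparing absolute weights is only meaningful when the input features share a common scale, which is why the synthetic columns here have unit variance.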
Similar resources
Nonlinear Emotional Prosody Generation and Annotation
Emotion is an important element in expressive speech synthesis. The paper briefly analyzes prosody parameters, stresses, rhythms and paralinguistic information in different emotional speech, and labels the speech with rich multi-layer annotation information. Then, a CART model is used for emotional prosody generation. Unlike the traditional linear modification method, which...
Affective and sensorimotor components of emotional prosody generation.
Although advances have been made regarding how the brain perceives emotional prosody, the neural bases involved in the generation of affective prosody remain unclear and debated. Two models have been forged on the basis of clinical observations: a first model proposes that the right hemisphere sustains production and comprehension of emotional prosody, while a second model proposes that emotion...
Emotional Voice Conversion Using Neural Networks with Different Temporal Scales of F0 based on Wavelet Transform
An artificial neural network is one of the most important models for training features of voice conversion (VC) tasks. Typically, neural networks (NNs) are very effective in processing nonlinear features, such as mel cepstral coefficients (MCC) which represent the spectrum features. However, a simple representation for fundamental frequency (F0) is not enough for neural networks to deal with an...
Emotional Voice Conversion with Adaptive Scales F0 Based on Wavelet Transform Using Limited Amount of Emotional Data
Deep learning techniques have been successfully applied to speech processing. Typically, neural networks (NNs) are very effective in processing nonlinear features, such as mel cepstral coefficients (MCC), which represent the spectrum features in voice conversion (VC) tasks. Despite these successes, the approach is restricted to problems with moderate dimension and sufficient data. Thus, in emot...
The emotional paradox: dissociation between explicit and implicit processing of emotional prosody in schizophrenia.
People with schizophrenia show well-replicated deficits on tasks of explicit recognition of emotional prosody. However, it remains unclear whether they are still sensitive to the implicit cues of emotional prosody, particularly when they exhibit high levels of social anhedonia. A dual processing model suggesting a dissociation between the neural networks involved in explicit and implicit recogni...